tests: cross-driver regression matrix harness #32
Merged
Adds `tests/regress.py` — a manual-run Python orchestrator that compares
this project's userspace stack against the kernel-driver baseline for
both TX and RX on two plugged-in USB Wi-Fi adapters:
| | TX = devourer | TX = kernel |
|---|---|---|
| RX = devourer | devourer end-to-end | does dvr RX a kernel-TX frame? |
| RX = kernel | does dvr emit valid frames? | baseline / rig sanity check |
Each cell injects/receives the canonical beacon (SA 57:42:75:05:d6:00,
matching txdemo/main.cpp) for --duration seconds and counts hits. A
cell passes if hits >= --pass-threshold. Output is a markdown table —
designed to paste into PR comments.
Pieces:
- `tests/regress.py` — matrix orchestrator. Auto-detects DUTs via
sysfs, handles per-cell kernel-driver bind/unbind, parses devourer
log output, supports --no-baseline-abort for partial-rig setups
where one chipset has no working kernel driver.
- `tests/inject_beacon.py` — standalone scapy injector for the
kernel-TX cells. Emits the same beacon WiFiDriverTxDemo uses, so
cross-driver SA matching works either direction.
- `tests/README.md` — usage, prereqs, distro-agnostic install hints,
VM-readiness notes (kernel-cell shell-outs all go through one
function — drop in `ssh trainer-vm sudo` to migrate the kernel
driver into a pinned-kernel VM when host upgrades start breaking
the out-of-tree aircrack-ng driver).
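To make the "parses devourer log output" step concrete, here is a minimal sketch of a TX-count parser. It assumes that devourer tags TX lines with `<devourer-tx>` followed by a running `TX #N` counter and reports failed sends with the text `Failed to send packet`; the exact line layout and the `parse_tx_log` name are assumptions, not taken from the real script.

```python
import re

# Assumed log shapes: "<devourer-tx> ... TX #1000" and "Failed to send packet".
TX_LINE = re.compile(r"<devourer-tx>.*?TX #(\d+)")
FAIL_TEXT = "Failed to send packet"


def parse_tx_log(log_text: str) -> dict:
    """Return the highest TX counter seen and the number of failed sends."""
    last_tx = 0
    failures = 0
    for line in log_text.splitlines():
        if FAIL_TEXT in line:
            failures += 1
            continue
        match = TX_LINE.search(line)
        if match:
            # Treat N as a running counter, so the largest value is the total.
            last_tx = max(last_tx, int(match.group(1)))
    return {"tx_count": last_tx, "failures": failures}
```

Reporting failures separately from the counter keeps a cell with a failing chip distinguishable from a cell that simply sent few frames.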
Portability: tool paths resolved via `which`, wlan iface names
discovered via `iw dev` (works for systemd's `wlp*` and classic
`wlan*`), kernel driver claiming each DUT read from sysfs (no
hardcoded module names). Preflight check prints actionable install
hints if anything's missing.
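A minimal sketch of the two portability checks described above, assuming Python's `shutil.which` stands in for the `which` lookup and that `iw dev` prints one `Interface <name>` line per interface. The hint text and the function names are hypothetical.

```python
import re
import shutil

# Hypothetical hint text, for illustration only.
INSTALL_HINTS = {
    "iw": "install the 'iw' package",
    "tcpdump": "install the 'tcpdump' package",
}


def missing_tools(tools, which=shutil.which):
    """Return {tool: hint} for every tool that is not found on PATH."""
    return {
        tool: INSTALL_HINTS.get(tool, f"install '{tool}'")
        for tool in tools
        if which(tool) is None
    }


def parse_iw_dev(output: str) -> list:
    """Extract interface names from `iw dev` output, whatever their prefix."""
    return re.findall(r"^\s*Interface\s+(\S+)", output, flags=re.MULTILINE)
```

Because the parser keys on the `Interface` keyword rather than a name prefix, `wlp*` and `wlan*` names are handled the same way.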
First-run validation on trainer-arch (Arch, kernel 6.x, 0bda:8812 +
0bda:8813 in a USB hub): the devourer-TX(8814) → kernel-RX(8812)
cell passed, proving devourer's RTL8814AU TX path (per #29) really
does emit frames the mainline rtw88 picks up. The remaining cells
correctly identified the rig's known limitations — mainline
rtw88_8814au can't probe this 8814AU dongle on this kernel (firmware-
download error -22), and 8814AU RX is a pre-existing TODO. README
explains how to interpret a partial matrix in that case.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
josephnef added a commit that referenced this pull request May 23, 2026
…el) (#33)

## What this is

Adds a libvirt-VM execution mode to `tests/regress.py` so the kernel-side cells of the regression matrix can run against the `aircrack-ng/rtl8812au` out-of-tree driver on a **pinned kernel**, instead of fighting the host kernel.

## Why a VM

The OOT `aircrack-ng/rtl8812au` driver lags kernel API changes by 6-12 months (`timer_*`, cfg80211 callback signatures with MLO link_id, etc.). On kernel 6.15+ it needs hand-patching to build. morrownr's README flags that mainline `rtw88_*` is now the recommended path from kernel 6.14 onwards, but **mainline `rtw88_8814au` currently fails to probe** RTL8814AU on this lab's adapter (`failed to download firmware`, `error -22`). So for 8814 specifically, OOT aircrack-ng is the only working kernel-side path.

Pinning a VM to Ubuntu 22.04 LTS (kernel 5.15) gives a stable platform where aircrack-ng's driver builds and loads cleanly. The host can upgrade freely without breaking the test rig.

## Pieces

**`tests/setup_vm.sh`**: one-shot VM provisioner. Clones an Ubuntu 22.04 cloud image (`jammy-base.qcow2`), generates a cloud-init seed (creates `dima` user with caller's SSH key, NOPASSWD sudo, installs build-essential / dkms / linux-headers / iw / tcpdump / python3-scapy / aircrack-ng), `virt-install`s with `qemu-xhci` USB controller for hot-plug, runs `make dkms_install` of `aircrack-ng/rtl8812au` inside via `runcmd`. ~5-10 min end to end. `--teardown` and `--status` subcommands included.

**`tests/regress.py` refactor**: introduces a `KernelHost` abstraction owning every kernel-side operation (`modprobe`, sysfs reads, `iw`, `tcpdump`, scapy). Local mode = `subprocess.run`. VM mode = `ssh ... sudo` + `virsh attach-device`/`detach-device` for per-cell USB passthrough. New CLI flags `--vm-name` / `--vm-ssh` (env: `DEVOURER_VM_NAME`, `DEVOURER_VM_SSH`). When invoked under `sudo`, picks up `SUDO_USER`'s SSH key, since root usually doesn't have keys provisioned on the VM.
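The local-versus-VM split can be illustrated with a small sketch. This is not the `KernelHost` class from the commit: the `KernelHostSketch` name, its `argv`/`run` methods, and the constructor parameters are hypothetical, and local mode assumes the script itself already runs as root.

```python
import shlex
import subprocess


class KernelHostSketch:
    """Runs kernel-side commands locally or over SSH (illustrative only)."""

    def __init__(self, vm_ssh=None, ssh_key=None):
        self.vm_ssh = vm_ssh    # e.g. "user@vm-address"; None means local mode
        self.ssh_key = ssh_key  # optional private-key path for VM mode

    def argv(self, cmd):
        """Return the argv that would execute `cmd` (a list of strings)."""
        if self.vm_ssh is None:
            return list(cmd)
        ssh = ["ssh"]
        if self.ssh_key:
            ssh += ["-i", self.ssh_key]
        # ssh joins remote arguments with spaces, so quote them first.
        return [*ssh, self.vm_ssh, "sudo " + shlex.join(cmd)]

    def run(self, cmd, timeout=30):
        """Execute the command and capture its output as text."""
        return subprocess.run(self.argv(cmd), capture_output=True,
                              text=True, timeout=timeout)
```

With this shape, a caller such as `host.run(["iw", "dev"])` does not need to know where the kernel driver lives; only the constructor arguments differ between modes.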
**Per-cell DUT routing**: each cell calls `_ensure_dut_location` for each DUT, which (in VM mode) moves the DUT between host and VM via virsh as needed. State is always restored to "both DUTs on host" between cells via try/finally so a crashed cell doesn't poison the next one. Script start has a `release_all_known_duts` pass for leftover-attached DUTs from previous aborted runs.

## Validation on trainer-arch

Arch Linux host kernel 6.18, VM Ubuntu 22.04 LTS kernel 5.15, two USB DUTs in a hub (0bda:8812 RTL8812AU + 0bda:8813 RTL8814AU):

```
## Regression matrix — channel 100, 2026-05-23 13:22:14

- TX adapter: 0bda:8812 (RTL8812AU)
- RX adapter: 0bda:8813 (RTL8814AU)
- Kernel host: VM devourer-testrig via dima@10.216.129.126
- Cell duration: 10s
- Pass threshold: ≥ 3 hits

| | TX = devourer | TX = kernel |
|---|---|---|
| RX = devourer | 0 hits / 4500 TX ✗ | 0 hits / 258 TX ✗ |
| RX = kernel | 4172 hits / 4500 TX ✓ | 229 hits / 259 TX ✓ |
```

- **Baseline ✓** kernel-TX 8812 → kernel-RX 8814 inside VM, **~88% delivery**
- **devourer-TX validation ✓** devourer-TX 8812 on host → kernel-RX 8814 in VM, **~93% delivery**, which confirms devourer's RTL8812AU TX really emits valid frames at the wire level
- The two failing cells are the pre-existing devourer 8814 RX TODO, not regressions; cell 3's new "0 hits / 258 TX" output correctly fingers the RX side (TX side really did emit 258 frames; devourer-RX 8814 silent)

For comparison: the same hardware in local mode from #32's first run got **1 hit** on the devourer-TX→kernel-RX cell because mainline `rtw88_8814au` couldn't probe the chip. The VM with aircrack-ng gives **~4000× the signal**.
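The delivery percentages quoted above follow directly from the hit and TX counts in the matrix. A short worked check, where `delivery_pct` is a hypothetical helper name:

```python
def delivery_pct(hits: int, tx_count: int) -> float:
    """Percentage of transmitted frames that the RX side counted."""
    if tx_count == 0:
        return 0.0
    return 100.0 * hits / tx_count


baseline = delivery_pct(229, 259)       # kernel-TX -> kernel-RX cell
devourer_tx = delivery_pct(4172, 4500)  # devourer-TX -> kernel-RX cell
print(f"baseline: {baseline:.1f}%")
print(f"devourer-TX: {devourer_tx:.1f}%")
```

The "~4000×" figure is the same kind of ratio: 4172 hits in VM mode against the single hit from the earlier local-mode run.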
## Smaller fixes folded in

- TX-count parser surfaces "Failed to send packet" failure count separately from the rate-limited `<devourer-tx>` print count (previously misleadingly low when sends were failing)
- `--no-baseline-abort` flag for partial-rig diagnostics
- `wait_for_wlan_iface` timeout bumped to 20s (kernel rebinds + VM passthrough enumeration take 10s+)
- Kernel-TX cells `wait()` for `inject_beacon` to self-terminate instead of killing the ssh wrapper, which captures the final "sent N frames" line (previously TX count showed 0 even though RX side received frames)

## Usage

```bash
sudo tests/setup_vm.sh            # ~5-10 min, one-time
sudo tests/setup_vm.sh --status
sudo python3 tests/regress.py --channel 100 \
    --vm-name devourer-testrig \
    --vm-ssh dima@<VM-IP>
```

See [`tests/README.md`](tests/README.md) for full options, prereqs, architecture notes.

## Known limitations (documented in README)

- VM mode assumes a single libvirt host running both `virsh` (locally) and the VM. Pulling the VM onto a different host needs your own `virsh` wrapper.
- Per matrix run: ~3-4 min in VM mode (USB hot-plug adds ~5s per cell transition vs ~100s for local mode).
- Two-adapter scope today. >2 needs a pairing loop in `main()`.
- Cell 4 (`devourer-TX → devourer-RX`) needs both DUTs devourer-claimable simultaneously. If one chipset has broken devourer RX (current RTL8814AU TODO), that cell shows 0 regardless of TX.
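For the two-adapter limitation, one possible shape for the pairing loop is sketched below. It assumes every ordered (TX, RX) combination of distinct adapters should get its own matrix; the `tx_rx_pairs` name is hypothetical and nothing like it exists in `main()` today.

```python
from itertools import permutations


def tx_rx_pairs(duts):
    """Yield every ordered (tx, rx) pair of distinct adapters."""
    yield from permutations(duts, 2)


# With three adapters this yields six (tx, rx) combinations.
for tx, rx in tx_rx_pairs(["0bda:8812", "0bda:8813", "0bda:0000"]):
    print(f"matrix run: TX={tx} RX={rx}")
```

The third ID is a placeholder. Whether both directions of each pair are worth running is a design choice; `itertools.combinations` would halve the run count if direction does not matter.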
## Test plan

- [x] VM provisioning succeeds end-to-end (`setup_vm.sh` clean run on trainer-arch)
- [x] aircrack-ng/rtl8812au DKMS install works inside VM (kernel 5.15)
- [x] USB hot-plug of 8814AU into VM works (mainline rtw88 couldn't probe; aircrack-ng claims cleanly)
- [x] Full 4-cell matrix runs end-to-end in VM mode
- [x] Baseline cell passes (rig sanity)
- [x] devourer-TX → kernel-RX cell passes (cross-driver validation)
- [x] Failing cells produce diagnostic output (TX count vs RX hits)
- [ ] Validate on a different distro / different VM base image
- [ ] Validate with a 2× same-chip DUT setup (both cells with both-devourer pass)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## What this is

A manual-run Python orchestrator that compares devourer's userspace stack against the kernel driver (mainline `rtw88` / out-of-tree `aircrack-ng/rtl8812au`) on a host with two plugged-in USB Wi-Fi adapters. Emits a markdown table designed to paste into PR review comments.

Each cell injects/receives the canonical beacon (SA `57:42:75:05:d6:00`, matching `txdemo/main.cpp`) for `--duration` seconds and counts hits.

## Why now
PRs like #30 (RTL8821AU partial bring-up) need cross-driver validation: "does devourer's TX really emit valid frames?" and "can devourer RX a frame the kernel driver knows works?". Running these checks manually is fiddly (modprobe / unbind / iw / tcpdump dance per cell); this script does it in one command and prints a structured result.
This is not a 24x7 CI runner — too few PRs to justify the infrastructure. It's a script the reviewer runs on demand on a test rig.
## Usage

See `tests/README.md` for full options + prereqs.

## First-run validation on trainer-arch
Arch Linux, kernel 6.x, USB hub with 0bda:8812 (8812AU) + 0bda:8813 (8814AU):
The devourer-TX(8814) → kernel-RX(8812) cell passed, giving independent confirmation that #29's 8814AU TX bring-up really does land frames on the air. The remaining cells correctly identified the rig's known limitations: mainline `rtw88_8814au` can't probe this 8814AU dongle on this kernel (`failed to download firmware`, probe error -22), and 8814AU RX is a pre-existing TODO.

## Portability
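- Tool paths resolved via `which` (no `/usr/bin/X` hardcoding)
- wlan iface names discovered via `iw dev` (works for systemd `wlp*` and classic `wlan*`)
- Preflight check prints install hints for `iw`, `tcpdump`, `python3-scapy`, `aircrack-ng`

## VM-readiness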
The kernel-cell shell-outs all go through one function (`run_kernel_cmd`). Today: local exec. To migrate the kernel driver into a pinned-kernel VM (recommended once host kernel upgrades start breaking the out-of-tree aircrack-ng driver), wrap that function with `ssh trainer-vm sudo` and arrange USB hot-plug passthrough via libvirt. The matrix orchestrator doesn't need to change.

## Known limitations (documented in README)
- Two-adapter scope today. >2 needs a pairing loop in `main()`.
- `<devourer-tx>` `TX #N` prints are rate-limited, so when the chip is failing every send, the parser undercounts attempts. Mitigated by surfacing the failure count separately in the output.

## Test plan
- … `aircrack-ng/rtl8812au` driver instead of mainline rtw88

🤖 Generated with Claude Code